Assessing the impact of ABFT & Checkpoint composite strategies

Authors

  • George Bosilca
  • Aurelien Bouteiller
  • Thomas Herault
  • Yves Robert
  • Jack Dongarra
Abstract

Algorithm-specific fault-tolerant approaches promise unparalleled scalability and performance in failure-prone environments. With advances in the theoretical and practical understanding of the algorithmic traits that enable such approaches, a growing number of frequently used algorithms (including all widely used factorization kernels) have been shown to have these properties. These algorithms provide a temporal section of the execution during which the data is protected by its own intrinsic properties and can be algorithmically recomputed without the need for checkpoints. However, while typical scientific applications spend a significant fraction of their execution time in library calls that can be protected by algorithm-based fault tolerance (ABFT), they interleave these calls with sections that are difficult or even impossible to protect with ABFT. As a consequence, the only fault-tolerance approach currently used for these applications is checkpoint/restart. In this paper we propose a model and a simulator to investigate the behavior of a composite protocol that alternates between ABFT and checkpoint/restart to protect each phase of an iterative application composed of ABFT-aware and ABFT-unaware sections. We show that this approach drastically increases the performance delivered by the system, especially at scale, by making checkpoints less frequent while simultaneously decreasing the volume of data that must be checkpointed.
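To make the trade-off in the abstract concrete, the following is a minimal back-of-the-envelope sketch in Python. It is not the model or the simulator proposed in the paper. It assumes Young's classical first-order approximation for the checkpoint period, and it treats ABFT protection as a constant fractional overhead. The function names and the parameters `alpha` (share of execution time spent in ABFT-protectable library calls) and `abft_waste` are hypothetical, chosen only for illustration. The sketch ignores recovery and downtime costs, the checkpoints taken at phase boundaries, and the reduced checkpoint volume mentioned in the abstract.

```python
import math


def young_period(checkpoint_cost, mtbf):
    """Young's first-order approximation of the optimal checkpoint period."""
    return math.sqrt(2.0 * checkpoint_cost * mtbf)


def periodic_waste(checkpoint_cost, mtbf):
    """First-order fraction of time lost under periodic checkpointing alone.

    Two terms: the checkpoint overhead C/T, and the expected re-execution
    after a failure, T/(2*mtbf). Recovery and downtime are ignored.
    """
    period = young_period(checkpoint_cost, mtbf)
    return checkpoint_cost / period + period / (2.0 * mtbf)


def composite_waste(alpha, abft_waste, checkpoint_cost, mtbf):
    """Toy blend (an assumption, not the paper's model).

    A fraction `alpha` of the time runs under ABFT and loses `abft_waste`
    of that time; the remainder runs under periodic checkpointing.
    """
    rest = periodic_waste(checkpoint_cost, mtbf)
    return alpha * abft_waste + (1.0 - alpha) * rest


if __name__ == "__main__":
    # Illustrative numbers only: 50 s checkpoints, 10,000 s platform MTBF.
    cost, mtbf = 50.0, 10_000.0
    print("checkpoint-only waste:", periodic_waste(cost, mtbf))
    print("composite waste:", composite_waste(0.8, 0.03, cost, mtbf))
```

Under these assumptions the composite estimate falls below the checkpoint-only estimate whenever the ABFT overhead is smaller than the periodic-checkpointing waste, and the gap widens as the MTBF shrinks, which is the qualitative effect the abstract describes at scale.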

Similar resources

Composing resilience techniques: ABFT, periodic and incremental checkpointing

Algorithm Based Fault Tolerant (ABFT) approaches promise unparalleled scalability and performance in failure-prone environments. Thanks to recent advances in the understanding of the involved mechanisms, a growing number of important algorithms (including all widely used factorizations) have been proven ABFT-capable. In the context of larger applications, these algorithms provide a temporal sec...

Assessing the impact of knowledge management strategies on employee innovative behavior in the work place(case study:knowledge-based organizations)

The goal of this research is to evaluate the impact of knowledge management strategies on the innovative behavior of personnel in the workplace. The statistical population comprises all human resources managers and assistants in Mazandaran public universities, 105 in total. The sample size was determined to be 82 individuals using the Cochran formula. To gather data, a question...

TwinCG: Dual Thread Redundancy with Forward Recovery for Conjugate Gradient Methods

Even though iterative solvers like the Conjugate Gradients method (CG) have been studied for over fifty years, fault tolerance for such solvers has seen much attention in recent years. For iterative solvers, two major reliable strategies of recovery exist: checkpoint-restart for backward recovery, or some type of redundancy technique for forward recovery. Important redundancy techniques like AB...

Attachment-based family therapy and individual emotion-focused therapy for unresolved anger: Qualitative analysis of treatment outcomes and change processes.

Twenty-six clients who received 10 sessions of either attachment-based family therapy (ABFT) or individual emotion-focused therapy (EFT) for unresolved anger toward a parent were interviewed 6 months after completing treatment. Interviews were analyzed using the consensual qualitative research approach. Clients in both conditions reported improved relationships with parents, gaining a new persp...

Publication date: 2013